Classification  of  car  in  lane  using  support  vector  machines 


Michael  Del  Rose,  David  Gorsich,  Robert  Karlsen 
Tank  Automotive  Research,  Development,  and  Engineering  Center,  Warren,  MI  48397 


ABSTRACT 

Support  Vector  Machines  (SVMs)  have  become  popular  due  to  their  accuracy  in  classifying  sparse  data 
sets.  Their  computational  time  can  be  virtually  independent  of  the  size  of  the  feature  vector.  SVMs  have 
been shown to outperform other learning machines on many data sets. In this paper, we use SVMs to
detect  a  car  in  a  lane  of  traffic.  Digital  pictures  of  various  driving  situations  are  used.  The  results  from  the 
SVM  algorithm  are  compared  to  results  from  a  standard  neural  network  approach. 

Keywords:  Support  vector  machines,  neural  network,  image  processing,  pattern  recognition 


1.  INTRODUCTION 

Support vector machines (SVMs) are wide-margin classifiers that solve a quadratic programming problem to
find the maximum separation between classes [1-4]. The algorithm is applied to the Tank Automotive
Research,  Development,  and  Engineering  Center  (TARDEC)  car/lane  image  set.  The  image  set  was 
obtained  using  a  digital  camera  mounted  on  a  vehicle  dash.  The  images  are  composed  of  views  of  either  a 
clear  road  ahead  or  a  car  in  front  of  the  camera  at  various  distances.  A  simulation  experiment  is  performed 
to  determine  how  well  SVMs  can  do  in  warning  a  driver  when  a  vehicle  is  in  front  of  the  car.  The  SVM 
results  are  compared  with  results  from  a  standard  neural  network  approach. 

Section  2  describes  the  different  methods  used  to  process  the  images  to  find  a  good  feature  vector.  Section 
3  gives  the  results  of  the  study.  Section  4  describes  other  methods  that  might  warrant  further  investigation 
into  solving  this  problem.  For  detailed  information  on  SVMs  or  neural  networks,  the  reader  is  advised  to 
consult  the  references. 


2.  IMAGE  PRE-PROCESSING  METHODS 


Various  techniques  were  investigated  to  find  a  feature  vector  that  described  the  data  set.  Methods  such  as 
wavelets,  masks,  and  histograms  were  explored  with  some  having  success  and  others  not.  This  section 
describes  the  thoughts  behind  the  investigation  and  the  results  that  the  methods  gave. 

Data were collected using several different digital cameras mounted on the dash of various cars. The images
were in color, with sizes ranging from 640x512 to 1280x1024. Pictures were taken of common road surfaces
(dirt, highway, freeway, etc.) with either cars at different distances or no cars. Before processing, each
image was converted to grayscale and resized to a standard size using the nearest neighbor algorithm [5]. The
investigation into creating a good feature vector centered on finding the edges of the cars. Unfortunately,
isolating the edges around the car turned out to be difficult because other objects in the pictures had more
prominent features.
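
The grayscale conversion and nearest neighbor resizing described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the luma weights used for the gray conversion, and the target size are all assumptions, since the paper does not state them.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to rows of gray levels."""
    # Assumption: standard luma weights; the paper does not say how gray was computed.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def resize_nearest(img, new_h, new_w):
    """Resize a 2-D list of pixels by copying the nearest source pixel."""
    old_h, old_w = len(img), len(img[0])
    return [
        # Integer scaling maps each output coordinate back to a source coordinate.
        [img[(y * old_h) // new_h][(x * old_w) // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


if __name__ == "__main__":
    tiny = [[(255, 255, 255), (0, 0, 0)], [(0, 0, 0), (255, 255, 255)]]
    gray = to_grayscale(tiny)
    # The 4x4 target is a placeholder; the paper's standard size is not given.
    for row in resize_nearest(gray, 4, 4):
        print([round(v) for v in row])
```

Nearest neighbor resizing introduces no new gray levels, which keeps sharp intensity steps intact for the edge detection that follows.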


Report Documentation Page (Standard Form 298, Rev. 8-98; OMB No. 0704-0188)

Report date: 30 JUN 2000
Report type: Journal Article
Dates covered: 17-04-2000 to 23-05-2000
Performing organization: U.S. Army TARDEC, 6501 East Eleven Mile Rd, Warren, MI 48397-5000
Performing organization report number: #14235
Sponsoring/monitoring agency: U.S. Army TARDEC, 6501 East Eleven Mile Rd, Warren, MI 48397-5000 (TARDEC)
Sponsor/monitor's report number: #14235
Distribution/availability statement: Approved for public release; distribution unlimited
Security classification (report, abstract, this page): unclassified
Limitation of abstract: Public Release
Number of pages: 7

Figure 1 - Original image converted to grayscale and resized (a). Sobel mask applied to the image and thresholded at 0.2 (b), 0.4 (c), and 0.6 (d).


Figure 1a is an example of a typical grayscale image. A Sobel mask [5-7] is applied and the results are
thresholded to find the edges (Figures 1b - 1d). Figure 1 shows that raising the threshold suppresses noise
but also removes edges of the car. To reduce noise in the image without losing too much of the car outline,
another algorithm must be used.
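
The effect of the threshold can be illustrated with a small Sobel sketch. It assumes that the gradient magnitude is normalised by its maximum before thresholding, which is one plausible reading of threshold values such as .2, .4, and .6; the paper does not state the normalisation, and the function name is hypothetical.

```python
import math


def sobel_edges(gray, threshold):
    """Binary edge map: 1 where the normalised Sobel magnitude exceeds threshold."""
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    # Border pixels are left at zero because the 3x3 mask does not fit there.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            mag[y][x] = math.hypot(gx, gy)
    peak = max(max(row) for row in mag)
    if peak == 0:
        return [[0] * w for _ in range(h)]
    # Assumption: magnitudes are scaled to [0, 1] before the threshold is applied.
    return [[1 if m / peak > threshold else 0 for m in row] for row in mag]


if __name__ == "__main__":
    # Each row holds a weak intensity step (0 -> 0.3) and a strong one (0.3 -> 1).
    image = [[0, 0, 0.3, 0.3, 0.3, 1, 1, 1] for _ in range(6)]
    for t in (0.2, 0.4, 0.6):
        print(t, sum(map(sum, sobel_edges(image, t))))
```

In this synthetic image the weak step survives the two lower thresholds and disappears at the highest one, which mirrors the loss of faint car edges seen in Figure 1.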

Applying the wavelet transform [8-10] to the image (or to the edges of the image) works well if the intent is to use
the approximation from the wavelet transform to resize it. However, local statistics from the wavelet
transform (such as mean and energy) did not increase the classification rate. Instead, a look at the data
shows that, in the pictures that contain a car, the car lies in the center region. We can therefore reduce the
image by cutting out the middle 128x128 section. Doing so reduces the unwanted edges caused by
background objects. However, this method often cuts off edges of vehicles that are too close or off-center.

Another processing technique that was investigated used boxes of different sizes around the center, called
the box-in-box method [11]. Box sizes were 16x16, 32x32, 64x64, 128x128, and 256x256. These boxes
contained the Sobel edges of the image in that region. The goal was to encompass the full car outline
while reducing the added noise of the edges due to the background. Each box represented a distance of the
car from the camera (for images with cars). A number of boxes were trained (on both car and no-car
images) and tested. If one of the boxes showed that a car outline was present, then the classifier
classified the image as having a car in front. Unfortunately, this method did not classify well due to
different car styles, patches or glare on the roads, off-center cars, bridges, etc.

The final feature vector was developed using a few of the techniques described above, along with an
algorithm to find horizontal lines [12]. Since the edges of the cars are not consistent with one another, a different
way of viewing the images was needed. The car edges contain a number of horizontal lines coming
from the bumper, rear window, top, bottom, etc. In no-car images, the horizon and
tree lines also produce horizontal lines, so they had to be sectioned off.

A  Sobel  mask  was  applied  to  find  the  edges  in  the  center  128x128  area  of  the  images.  After  this,  an 
algorithm  was  applied  to  find  consecutive  horizontal  pixels  at  least  6  pixels  in  length.  Examples  of  the 
processing  are  shown  in  Figure  2  (for  a  car)  and  Figure  3  (for  no  car). 


Figure 2 - Image of a car (a) and the horizontal lines from the edges in the center region (b).


Figure 3 - Image of no car (a) and the horizontal lines from the edges in the center region (b).


From this, the lines were grouped by length (the number of consecutive pixels) and a
histogram was formed [12]. The line groups were 6-8, 9-11, 12-14, 15-17, 18-20, and 21+ pixels. These six counts,
along with the total number of horizontal lines (at least 6 pixels long), were used as the final feature vector.
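
The construction of the seven-element feature vector can be sketched as below. The sketch assumes a binary edge map of the center region as input, finds horizontal lines with a simple row-wise run-length scan (the algorithm of reference 12 may differ), and places the total after the six group counts; the function names and the element order are assumptions.

```python
def horizontal_run_lengths(edges, min_len=6):
    """Lengths of all runs of consecutive edge pixels per row, at least min_len long."""
    lengths = []
    for row in edges:
        run = 0
        # The trailing 0 closes a run that reaches the right border.
        for pixel in list(row) + [0]:
            if pixel:
                run += 1
            else:
                if run >= min_len:
                    lengths.append(run)
                run = 0
    return lengths


def feature_vector(edges):
    """Counts for the groups 6-8, 9-11, 12-14, 15-17, 18-20, 21+, then the total."""
    lengths = horizontal_run_lengths(edges)
    bins = [0] * 6
    for n in lengths:
        # Groups are 3 pixels wide starting at 6; everything from 21 up shares the last bin.
        bins[min((n - 6) // 3, 5)] += 1
    return bins + [len(lengths)]


if __name__ == "__main__":
    edge_map = [
        [1] * 6 + [0] * 4,   # one run of 6
        [1] * 5 + [0] * 5,   # run of 5 is too short and is ignored
        [0] + [1] * 9,       # one run of 9 touching the border
        [1] * 25,            # one run of 25
    ]
    print(feature_vector(edge_map))
```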


3.  RESULTS 


There were a total of 218 images: 89 car images and 129 no-car images. Each image was represented
by the seven-element feature vector described in Section 2. The feature data were split into training and test
vectors and put into a support vector machine (SVM) [3] and a standard neural network (NN) [13]. The SVM used a quadratic
polynomial kernel. The NN had two layers, with the hidden layer having either three or five neurons and
the output layer having one neuron. The activation function for all neurons was a unipolar sigmoid function.
The results are shown in Tables 1 and 2 below:


Trained 50 car and 50 no car:

SVM - poly 2 kernel
                  Classified as a car    Classified as no car
Actual Car                38                      1
Actual No Car              4                     75
Table 1a

NN - 3 hidden layer neurons
                  Classified as a car    Classified as no car
Actual Car                38                      1
Actual No Car              6                     73
Table 1b

NN - 5 hidden layer neurons
                  Classified as a car    Classified as no car
Actual Car                38                      1
Actual No Car              6                     73
Table 1c

Table 1 - SVM and NN classification matrix for 100 training samples


Calculating the classification rate for the SVM of Table 1a, we see that it chooses the correct class 95.8%
of the time. Both NNs are the same with a classification rate of 94.9% (see Tables 1b and 1c). For Table
2, the number of training vectors increased from 50 per class to 64 per class. The results show that the
SVM classification rate increased to 96.7%. The NN classification rates were 93.3% and 94.4% for the
three and five neuron networks, respectively.
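
The classification rate is the number of correctly classified test images divided by the total number of test images. The sketch below recomputes it from the matrix entries as they appear in Tables 1 and 2; the function and variable names are illustrative. With these counts the SVM rates reproduce the quoted 95.8% and 96.7%. The NN counts give 94.1% for Table 1, and 94.4% and 93.3% for the three and five neuron networks of Table 2, which differ from the figures quoted in the text, so either the table entries or the quoted rates may contain a transcription error.

```python
def classification_rate(car_as_car, car_as_no_car, no_car_as_car, no_car_as_no_car):
    """Percentage of test images assigned to the correct class."""
    correct = car_as_car + no_car_as_no_car
    total = car_as_car + car_as_no_car + no_car_as_car + no_car_as_no_car
    return 100.0 * correct / total


# Entries in the order: car->car, car->no car, no car->car, no car->no car.
table_1 = {"SVM": (38, 1, 4, 75), "NN-3": (38, 1, 6, 73), "NN-5": (38, 1, 6, 73)}
table_2 = {"SVM": (23, 2, 1, 64), "NN-3": (23, 2, 3, 62), "NN-5": (23, 2, 4, 61)}

rates_1 = {name: round(classification_rate(*m), 1) for name, m in table_1.items()}
rates_2 = {name: round(classification_rate(*m), 1) for name, m in table_2.items()}

if __name__ == "__main__":
    print("Table 1:", rates_1)
    print("Table 2:", rates_2)
```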


Trained 64 car and 64 no car:

SVM - poly 2 kernel
                  Classified as a car    Classified as no car
Actual Car                23                      2
Actual No Car              1                     64
Table 2a

NN - 3 hidden layer neurons
                  Classified as a car    Classified as no car
Actual Car                23                      2
Actual No Car              3                     62
Table 2b

NN - 5 hidden layer neurons
                  Classified as a car    Classified as no car
Actual Car                23                      2
Actual No Car              4                     61
Table 2c

Table 2 - SVM and NN classification matrix for 128 training samples


The results from this study show an example of SVMs outperforming NNs on a small data set. The
SVM classification rate was slightly higher than the NN rates for 100 training samples. The SVM
classification rate increased as the number of training samples increased. However, the NN classification
rates stayed the same or decreased as the number of training samples increased.
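
For reference, the building blocks of the two classifiers compared in this section can be sketched as follows. This is only an illustration under stated assumptions: the kernel is written in the common inhomogeneous form (x . y + 1)^2, the network weights are placeholders rather than trained values, and none of the names come from the software used in the study.

```python
import math


def poly2_kernel(x, y):
    """Quadratic polynomial kernel; the +1 offset is an assumption."""
    return (sum(a * b for a, b in zip(x, y)) + 1.0) ** 2


def sigmoid(z):
    """Unipolar sigmoid: output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def nn_forward(x, hidden_w, hidden_b, out_w, out_b):
    """One hidden layer of sigmoid neurons feeding a single sigmoid output neuron."""
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, x)) + b)
              for ws, b in zip(hidden_w, hidden_b)]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)


if __name__ == "__main__":
    features = [3, 1, 0, 0, 0, 1, 5]           # a made-up seven-element feature vector
    hidden_w = [[0.1] * 7, [-0.2] * 7, [0.05] * 7]  # three hidden neurons, placeholder weights
    print(poly2_kernel(features, features))
    print(nn_forward(features, hidden_w, [0.0] * 3, [1.0, -1.0, 0.5], 0.0))
```

An output above 0.5 from the network would be read as "car" under the usual convention for a single sigmoid output, although the decision threshold used in the study is not stated.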


4.  FURTHER  STUDY 


Using histograms of groups of horizontal lines seems to indicate when something is in the path of the vehicle.
Problems will arise if the vehicle's roll position differs from that of the vehicle that took the pictures; horizontal lines
will then no longer be horizontal. A method to find parallel lines rather than horizontal lines should be employed.
Using templates is another idea that could prove fruitful if it were used with the box-in-box method.
The templates would be used for training (at different distances) and the images would be used for testing.
Other paths of investigation should include better algorithms for denoising the data. The goal would be to
remove all edges but those of the car. Unfortunately, the car is not always the most prominent feature in the
image.


REFERENCES 


1. B. Schölkopf, C. Burges, A. Smola, Advances in Kernel Methods, MIT Press, Cambridge, 1999.

2.  V.  Vapnik,  Statistical  Learning  Theory,  John  Wiley  and  Sons,  New  York,  1998. 

3. S. Gunn, Support Vector Machines for Classification and Regression, Technical Report, Image Speech
and Intelligent Systems Group, Univ. of Southampton, 1998.

4. R. Karlsen, D. Gorsich, G. Gerhart, "Target Classification Via Support Vector Machines," Optical
Engineering, 39:704-711, 2000.

5.  H.  Burdick,  Digital  Imaging,  Theory  and  Application,  McGraw-Hill,  New  York,  1997. 

6.  G.  Baxes,  Digital  Image  Processing,  Principles  and  Applications,  John  Wiley  and  Sons,  New  York, 
1994. 

7.  R.  Crane,  A  Simplified  Approach  to  Image  Processing,  Prentice  Hall,  Upper  Saddle  River,  1997. 

8.  G.  Strang,  T.  Nguyen,  Wavelets  and  Filter  Banks,  Wellesley-Cambridge  Press,  Wellesley,  1997. 

9. R. Rao, A. Bopardikar, Wavelet Transforms, Introduction to Theory and Applications,
Addison-Wesley, Reading, 1998.

10.  M.  Del  Rose,  A  Wavelet  Processing  Method  for  Signals,  Technical  Report,  University  of  Michigan, 
1999. 

11.  M.  Del  Rose,  Image  Processing  Methods,  Technical  Report,  University  Of  Michigan,  1999. 

12.  M.  Del  Rose,  Hand  Written  Character  Recognition,  Masters  Thesis,  University  of  Michigan,  2000. 

13.  C.  Bishop,  Neural  Networks  for  Pattern  Recognition,  Oxford  University  Press,  New  York,  1995. 


